Reward-prediction error
Reinforcement learning (RL)
TD learning
Discount factor
Dopamine